Xi-Vector Embedding for Speaker Recognition
Authors
Abstract
We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate. On the technology front, we offer a simple and straightforward extension to the now widely used x-vector. It consists of an auxiliary neural net predicting the frame-wise uncertainty of the input sequence. We show that the proposed extension leads to substantial improvement across all operating points, with a significant reduction in error rates and detection cost. On the theoretical front, our proposal integrates the Bayesian formulation of the linear Gaussian model to speaker-embedding neural networks via the pooling layer. In one sense, our proposal integrates the Bayesian formulation of the i-vector to that of the x-vector. Hence, we refer to the embedding as the xi-vector, which is pronounced as /zai/ vector. Experimental results on the SITW evaluation set show a consistent improvement of over 17.5% in equal-error-rate and 10.9% in minimum detection cost.
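The pooling idea in the abstract, frame-wise estimates combined according to their predicted uncertainty under a linear Gaussian model, can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: it assumes diagonal precisions (inverse variances) and the standard Gaussian posterior update, and the function name `gaussian_posterior_pooling`, its arguments, and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def gaussian_posterior_pooling(
    frames: Sequence[Sequence[float]],
    precisions: Sequence[Sequence[float]],
    prior_mean: Sequence[float],
    prior_precision: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool frame-wise estimates into a posterior mean and precision.

    Assumption: precisions are diagonal, so each dimension is pooled
    independently. Frames with low precision (high uncertainty)
    contribute little to the pooled mean.
    """
    dim = len(prior_mean)
    # Start from the prior: its precision and its precision-weighted mean.
    post_precision = list(prior_precision)
    weighted_sum = [prior_precision[d] * prior_mean[d] for d in range(dim)]
    # Accumulate each frame's precision and precision-weighted estimate.
    for z_t, l_t in zip(frames, precisions):
        for d in range(dim):
            post_precision[d] += l_t[d]
            weighted_sum[d] += l_t[d] * z_t[d]
    post_mean = [weighted_sum[d] / post_precision[d] for d in range(dim)]
    return post_mean, post_precision


if __name__ == "__main__":
    # Toy input: two frames, two dimensions, standard-normal prior.
    mean, precision = gaussian_posterior_pooling(
        frames=[[1.0, 0.0], [3.0, 4.0]],
        precisions=[[1.0, 1.0], [1.0, 3.0]],
        prior_mean=[0.0, 0.0],
        prior_precision=[1.0, 1.0],
    )
    print(mean, precision)
```

In this sketch, a frame whose predicted precision is close to zero leaves the pooled mean almost unchanged. In that sense an uncertainty estimate can down-weight unreliable frames, in contrast to plain average pooling, which weights every frame equally.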
Similar resources
Graph-embedding for speaker recognition
Popular methods for speaker classification perform speaker comparison in a high-dimensional space [1]; however, recent work [2] has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure in terms of nonlinear manifolds exists within the high-dimensional space. We will use graph embedding [3] as a p...
Phonetic Speaker Recognition with Support Vector Machines
A recent area of significant progress in speaker recognition is the use of high-level features: idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with long-term statistics of phone patterns, word patterns, et...
A Vector Quantization Approach to Speaker Recognition
In this study a vector quantization (VQ) codebook was used as an efficient means of characterizing the short-time spectral features of a speaker. A set of such codebooks was then used to ... system. In the other, Shore and Burton [12] used word-based VQ codebooks and reported good performance in speaker-trained isolated-word recognition experime...
Automatic Speaker Recognition Using Fuzzy Vector Quantization
Speaker recognition (SR) is a dynamic biometric task. SR is a multidisciplinary problem that encompasses many aspects of human speech, including speech recognition, language recognition, and speech accents. This technique makes it possible to use the speaker’s voice to verify his/her identity and provide controlled access to services. The Mel-frequency extraction method is a leading approach for sp...
Unsupervised Domain Adaptation for I-vector Speaker Recognition
In this paper, we present a framework for unsupervised domain adaptation of PLDA-based i-vector speaker recognition systems. Given an existing out-of-domain PLDA system, we use it to cluster unlabeled in-domain data, and then use this data to adapt the parameters of the PLDA system. We explore two versions of agglomerative hierarchical clustering that use the PLDA system. We also study two auto...
Journal
Journal title: IEEE Signal Processing Letters
Year: 2021
ISSN: ['1558-2361', '1070-9908']
DOI: https://doi.org/10.1109/lsp.2021.3091932